{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "ba450fb1-8a26-4894-ab7a-5d7bfefe90ce",
   "metadata": {},
   "source": [
    "<table style=\"width:100%\">\n",
    "<tr>\n",
    "<td style=\"vertical-align:middle; text-align:left;\">\n",
    "<font size=\"2\">\n",
    "Supplementary code for the <a href=\"http://mng.bz/orYv\">Build a Large Language Model From Scratch</a> book by <a href=\"https://sebastianraschka.com\">Sebastian Raschka</a><br>\n",
    "<br>Code repository: <a href=\"https://github.com/rasbt/LLMs-from-scratch\">https://github.com/rasbt/LLMs-from-scratch</a>\n",
    "<br>汉化的库: <a href=\"https://github.com/GoatCsu/CN-LLMs-from-scratch.git\">https://github.com/GoatCsu/CN-LLMs-from-scratch.git</a>\n",
    "</font>\n",
    "</td>\n",
    "<td style=\"vertical-align:middle; text-align:left;\">\n",
    "<a href=\"http://mng.bz/orYv\"><img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/cover-small.webp\" width=\"100px\"></a>\n",
    "</td>\n",
    "</tr>\n",
    "</table>\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "51c9672d-8d0c-470d-ac2d-1271f8ec3f14",
   "metadata": {},
   "source": [
    "# Chapter 6 练习"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5fea8be3-30a1-4623-a6d7-b095c6c1092e",
   "metadata": {},
   "source": [
    "## Exercise 6.1: 增长长度"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5860ba9f-2db3-4480-b96b-4be1c68981eb",
   "metadata": {},
   "source": [
    "我们可以通过将最大长度设置为 1024 来填充输入至模型支持的最大token数：\n",
    "\n",
    "```python\n",
    "max_length = 1024\n",
    "\n",
    "train_dataset = SpamDataset(base_path / \"train.csv\", max_length=max_length, tokenizer=tokenizer)\n",
    "val_dataset = SpamDataset(base_path / \"validation.csv\", max_length=max_length, tokenizer=tokenizer)\n",
    "test_dataset = SpamDataset(base_path / \"test.csv\", max_length=max_length, tokenizer=tokenizer)\n",
    "```\n",
    "\n",
    "或者，我们也可以通过以下方式定义 max_length：\n",
    "\n",
    "``` python\n",
    "max_length = model.pos_emb.weight.shape[0]\n",
    "```\n",
    "\n",
    "or\n",
    "\n",
    "```python\n",
    "max_length = BASE_CONFIG[\"context_length\"]\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2b0f4d5d-17fd-4265-93d8-ea08a22fdaf8",
   "metadata": {},
   "source": [
    "为了方便起见，您可以通过以下命令运行此实验：\n",
    "```bash\n",
    "python additional-experiments.py --context_length \"model_context_length\"\n",
    "```\n",
    "使用 ../02_bonus_additional-experiments 文件夹中的代码，这将导致测试准确率大幅下降，达到 78.33%（而在主章节中为 95.67%）。"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5a780455-f52a-48d1-ab82-6afd40bcad8b",
   "metadata": {},
   "source": [
    "## Exercise 6.2: 微调整个模型"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "56aa5208-aa29-4165-a0ec-7480754e2a18",
   "metadata": {},
   "source": [
    "我们可以通过移除以下代码行来微调整个模型，而不仅仅是微调最后一个 Transformer 层：\n",
    "```python\n",
    "for param in model.parameters():\n",
    "    param.requires_grad = False\n",
    "```\n",
    "\n",
    "为了方便起见，您可以通过以下命令运行此实验：\n",
    "\n",
    "```bash\n",
    "python additional-experiments.py --trainable_layers all\n",
    "```\n",
    "\n",
    "用这儿的的代码[../02_bonus_additional-experiments](../02_bonus_additional-experiments) 可以提升一个点!"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2269bce3-f2b5-4a76-a692-5977c75a57b6",
   "metadata": {},
   "source": [
    "## Exercise 6.3: 微调第一个标记与最后一个标记"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7418a629-51b6-4aa2-83b7-bc0261bc370f",
   "metadata": {},
   "source": [
    "我们可以通过将以下代码中的最后一个输出标记改为第一个输出标记来微调第一个标记：\n",
    "\n",
    "```python\n",
    "model(input_batch)[:, -1, :]\n",
    "```\n",
    "\n",
    "to\n",
    "\n",
    "```python\n",
    "model(input_batch)[:, 0, :]\n",
    "```\n",
    "\n",
    "在代码的所有相关位置进行修改。\n",
    "\n",
    "为了方便起见，您可以通过以下命令运行此实验：\n",
    "\n",
    "```\n",
    "python additional-experiments.py --trainable_token first\n",
    "```\n",
    "\n",
    "[../02_bonus_additional-experiments](../02_bonus_additional-experiments) 这儿的代码将导致测试准确率大幅下降，降至 75.00%（而在主章节中为 95.67%）。"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
